
University of St. Andrews - Pure

St Andrews Research Repository

A generic testing framework for agent-based simulation models

Author: Beck K
Burnstein I
C Bernon
Calvin W
Feathers M
Gürcan Ö
House R
Klügl F
Larman C
Law AM
MacQueen JB
Nguyen C
Nikolai C
O Dikenelli
Pidd M
Railsback SF
Schwindt P
Utting M
Wilensky U
Windrum P
Wolfram S
Ö Gürcan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Agent-based modelling and simulation (ABMS) has received increasing attention during the last decade. However, the weak validation and verification of agent-based simulation models makes ABMS hard to trust. There is no comprehensive tool set for verification and validation of agent-based simulation models that demonstrates that inaccuracies exist and/or reveals the existing errors in the model. Moreover, on the practical side, many ABMS frameworks are in use. We therefore designed and developed a generic testing framework for agent-based simulation models to conduct validation and verification of models. This paper presents our testing framework in detail and demonstrates its effectiveness by showing its applicability to a realistic agent-based simulation case study.

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

Ege University Institutional Repository

Analysis of multiplex gene expression maps obtained by voxelation

Author: AK Jain
B Albert
D Lin
D Liu
Desmond J Smith
G Kaiser
Hongbo Xie
JA Hartigan
JB MacQueen
Li An
Mark H Chin
MB Eisen
MH Chin
MS Boguski
PO Brown
RJ Lipshutz
RP Singh
Vasileios Megalooikonomou
VE Velculescu
VM Brown
VM Brown
Zoran Obradovic
Publication venue: BioMed Central
Publication date: 01/04/2009

Background Gene expression signatures in the mammalian brain hold the key to understanding neural development and neurological disease. Researchers have previously used voxelation in combination with microarrays to acquire genome-wide atlases of expression patterns in the mouse brain. On the other hand, some work has studied gene functions without taking into account the location information of a gene's expression in the mouse brain. In this paper, we present an approach for identifying the relation between gene expression maps obtained by voxelation and gene functions. Results To analyze the dataset, we chose typical genes as queries and aimed at discovering similar gene groups. Gene similarity was determined by using the wavelet features extracted from the gene expression maps averaged over the left and right hemispheres, and by the Euclidean distance between each pair of feature vectors. We also performed a multiple clustering approach on the gene expression maps, combined with hierarchical clustering. Within each group of similar genes and clusters, the gene function similarity was measured by calculating the average gene function distances in the gene ontology structure. By applying our methodology to find genes similar to certain target genes, we were able to improve our understanding of gene expression patterns and gene functions. By applying the clustering analysis method, we obtained significant clusters, which have both very similar gene expression maps and very similar gene functions with respect to their corresponding gene ontologies. The cellular component ontology resulted in prominent clusters expressed in cortex and corpus callosum. The molecular function ontology gave prominent clusters in cortex, corpus callosum and hypothalamus. The biological process ontology resulted in clusters in cortex, hypothalamus and choroid plexus. Clusters from all three ontologies combined were most prominently expressed in cortex and corpus callosum. Conclusion The experimental results confirm the hypothesis that genes with similar gene expression maps might have similar gene functions. The voxelation data takes into account the location information of gene expression level in mouse brain, which is novel in related research. The proposed approach can potentially be used to predict gene functions and provide helpful suggestions to biologists.
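
The similarity step described in this abstract (hemisphere averaging, wavelet features, Euclidean distance) can be illustrated with a minimal sketch. It is not the authors' implementation: the one-level Haar approximation stands in for whatever wavelet features the paper uses, the maps are flattened to short 1D lists, and the gene names and values in `toy_maps` are made up.

```python
import math

def average_hemispheres(left, right):
    """Element-wise mean of the left and right hemisphere maps."""
    return [(l + r) / 2.0 for l, r in zip(left, right)]

def haar_approx(signal):
    """One-level Haar approximation coefficients (scaled pairwise sums).

    Assumption: a single Haar level is used purely for illustration.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    return [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_similar(query, maps):
    """Return the other gene names ordered by feature distance to the query."""
    features = {
        gene: haar_approx(average_hemispheres(left, right))
        for gene, (left, right) in maps.items()
    }
    others = [gene for gene in features if gene != query]
    return sorted(others, key=lambda g: euclidean(features[query], features[g]))

# Hypothetical toy data: (left hemisphere, right hemisphere) per gene.
toy_maps = {
    "geneA": ([1.0, 1.2, 5.0, 5.1], [1.0, 1.0, 5.2, 4.9]),
    "geneB": ([1.1, 1.0, 4.9, 5.3], [0.9, 1.2, 5.1, 5.1]),
    "geneC": ([5.0, 4.8, 1.0, 0.9], [5.0, 4.8, 1.0, 0.9]),
}
ranking = rank_similar("geneA", toy_maps)
```

Here `ranking` lists `geneB` before `geneC`, because the spatial pattern of `geneB` resembles that of the query while `geneC` is expressed in the opposite region.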

eScholarship - University of California

Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs

Author: A Banerjee
A Clauset
A Misbahuddin
A Ruepp
AK Jain
AL Barabási
AN Langville
AP Erdös
B Long
CJ Sylvester
D Lee
D Lee
D Zhou
E Hüllermeier
E Ravasz
Fabian J Theis
Florian Blöchl
G Karypis
G Palla
H Cho
I Dhillon
J Bezdek
J Dunn
JB MacQueen
JB Pereira-Leal
K Devarajan
KI Goh
KV Mardia
M Barber
M Campos
M Fiorio
MA Yildirim
Mara L Hartsperger
N Gulbahce
P Paatero
P Wong
R Montanez
RC Samaco
RJ Shprintzen
RR Lebel
S Bauer
S Klamt
S Maslov
T Barnickel
Volker Stümpflen
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract Background Extensive and automated data integration in bioinformatics facilitates the construction of large, complex biological networks. However, the challenge lies in the interpretation of these networks. While most research focuses on the unipartite or bipartite case, we address the more general but common situation of k-partite graphs. These graphs contain k different node types and links are only allowed between nodes of different types. In order to reveal their structural organization and describe the contained information in a more coarse-grained fashion, we ask how to detect clusters within each node type. Results Since entities in biological networks regularly have more than one function and hence participate in more than one cluster, we developed a k-partite graph partitioning algorithm that allows for overlapping (fuzzy) clusters. It determines for each node a degree of membership to each cluster. Moreover, the algorithm estimates a weighted k-partite graph that connects the extracted clusters. Our method is fast and efficient, mimicking the multiplicative update rules commonly employed in algorithms for non-negative matrix factorization. It facilitates the decomposition of networks on a chosen scale and therefore allows for analysis and interpretation of structures on various resolution levels. Applying our algorithm to a tripartite disease-gene-protein complex network, we were able to structure this graph on a large scale into clusters that are functionally correlated and biologically meaningful. Locally, smaller clusters enabled reclassification or annotation of the clusters' elements. We exemplified this for the transcription factor MECP2. Conclusions In order to cope with the overwhelming amount of information available from biomedical literature, we need to tackle the challenge of finding structures in large networks with nodes of multiple types. To this end, we presented a novel fuzzy k-partite graph partitioning algorithm that allows the decomposition of these objects in a comprehensive fashion. We validated our approach both on artificial and real-world data. It is readily applicable to any further problem.
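
The abstract states that the method mimics the multiplicative update rules used in non-negative matrix factorization (NMF). The sketch below shows only those standard NMF updates (the Lee and Seung rules for the squared error), not the k-partite algorithm itself. Reading the row-normalised factor `w` as fuzzy membership degrees is an assumption made for illustration, and the matrix `v` is made-up toy data.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def squared_error(v, w, h):
    """Squared Frobenius norm of v - w*h."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iters=100, seed=0, eps=1e-9):
    """Factorise non-negative v (n x m) into w (n x rank) and h (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    errors = [squared_error(v, w, h)]
    for _ in range(iters):
        # h <- h * (w^T v) / (w^T w h), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # w <- w * (v h^T) / (w h h^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
        errors.append(squared_error(v, w, h))
    return w, h, errors

def memberships(w):
    """Row-normalise w so each node's degrees over the clusters sum to 1."""
    return [[x / sum(row) for x in row] for row in w]

# Toy bipartite adjacency matrix with two obvious blocks (made-up data).
v = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]
w, h, errors = nmf(v, rank=2)
degrees = memberships(w)
```

Because the updates only multiply by non-negative ratios, `w` and `h` stay non-negative throughout, which is what makes the normalised rows usable as soft cluster assignments.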

PuSH

Building cooperation through health initiatives: an Arab and Israeli case study

Author: A Vass
Abi Sriharan
AM Noyek
Birn
DT Jameson
DT Jameson
G MacQueen
H Shahin
H Shahin
H Skinner
HA Amery
Harvey A Skinner
I Zwaenepoel
J Attias
J Lederach
J Santa Barbara
J Stein
JB Nadol Jr
M Manenti
P Scham
R Bader
R Garber
R Isralowitz
R Vardi-Saliternik
RK Yin
S Jabbour
S Wasserman
S Yusef
T Barnea
T Walsh
T Walsh
Publication venue: BioMed Central
Publication date: 01/07/2007

Abstract Background Ongoing conflict in the Middle East poses a major threat to health and security. A project screening Arab and Israeli newborns for hearing loss provided an opportunity to evaluate ways for building cooperation. The aims of this study were to: a) examine what attracted Israeli, Jordanian and Palestinian participants to the project, b) describe challenges they faced, and c) draw lessons learned for guiding cross-border health initiatives. Methods A case study method was used involving 12 key informants stratified by country (3 Israeli, 3 Jordanian, 3 Palestinian, 3 Canadian). In-depth interviews were tape-recorded, transcribed and analyzed using an inductive qualitative approach to derive key themes. Results Major reasons for getting involved included: concern over an important health problem, curiosity about neighbors and opportunities for professional advancement. Participants were attracted to prospects for opening the dialogue, building relationships and facilitating cooperation in the region. The political situation was a major challenge that delayed implementation of the project and placed participants under social pressure. Among lessons learned, fostering personal relationships was viewed as critical for success of this initiative. Conclusion Arab and Israeli health professionals were prepared to get involved for two types of reasons: a) Project Level: opportunity to address a significant health issue (e.g. congenital hearing loss) while enhancing their professional careers, and b) Meta Level: concern about taking positive steps for building cooperation in the region. We invite discussion about roles that health professionals can play in building "cooperation networks" for underpinning health security, conflict resolution and global health promotion.

Comparison of scores for bimodality of gene expression distributions and genome-wide evaluation of the prognostic relevance of high-scoring genes

Author: A Ertel
AE Teschendorff
AM Kellerer
Birte Hellwig
C Desmedt
C Fraley
C Fraley
DS Srivastava
J Wang
JA Hartigan
Jan G Hengstler
JB MacQueen
JG Hengstler
JS Carroll
Jörg Rahnenführer
K Golka
L Gautier
M Mächler
M Schmidt
M Schmidt
M Schmidt
M Schmidt
Marcus Schmidt
Mathias C Gehrmann
N Mantel
R Tibshirani
RA Irizarry
RB Latta
RC Gentleman
RH Wong
S Holm
S Loi
S Loi
SA Tomlins
SA Tomlins
VN Kristensen
Wiebke Schormann
WJ Conover
Y Benjamini
Y Wang
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract Background A major goal of the analysis of high-dimensional RNA expression data from tumor tissue is to identify prognostic signatures for discriminating patient subgroups. For this purpose, genome-wide identification of bimodally expressed genes from gene array data is relevant because high and low expression groups are easier to distinguish than for genes with unimodal expression distributions. Recently, several methods for the identification of genes with bimodal distributions have been introduced. A straightforward approach is to cluster the expression values and score the distance between the two distributions. Other scores directly measure properties of the distribution. The kurtosis, for example, measures divergence from a normal distribution. An alternative is the outlier-sum statistic, which identifies genes with extremely high or low expression values in a subset of the samples. Results We compare and discuss scores for bimodality for expression data. For the genome-wide identification of bimodal genes we apply all scores to expression data from 194 patients with node-negative breast cancer. Further, we present the first comprehensive genome-wide evaluation of the prognostic relevance of bimodal genes. We first rank genes according to bimodality scores and define two patient subgroups based on expression values. Then we assess the prognostic significance of the top ranking bimodal genes by comparing the survival functions of the two patient subgroups. We also evaluate the global association between the bimodal shape of expression distributions and survival times with an enrichment type analysis. Various cluster-based methods lead to a significant overrepresentation of prognostic genes. A striking result is obtained with the outlier-sum statistic (p < 10^-12). Many genes with heavy tails generate subgroups of patients with different prognosis. Conclusions Genes with high bimodality scores are promising candidates for defining prognostic patient subgroups from expression data. We discuss advantages and disadvantages of the different scores for prognostic purposes. The outlier-sum statistic may be particularly valuable for the identification of genes to be included in prognostic signatures. Among the genes identified as bimodal in the breast cancer data set, several have not previously been recognized to be prognostic and bimodally expressed in breast cancer.
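
Two of the scores named in this abstract, the kurtosis and the outlier-sum statistic, can be sketched in a few lines. This is a simplified single-group reading, not the paper's code: the excess kurtosis uses the plain moment estimator, the outlier-sum standardises by the median and the unscaled median absolute deviation and sums values above the third quartile plus one interquartile range, and the sample values are made up.

```python
import statistics

def excess_kurtosis(values):
    """Moment estimator m4 / m2**2 - 3 (zero for a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / (m2 ** 2) - 3.0

def outlier_sum(values):
    """Sum of robustly standardised values above Q3 + IQR.

    Assumption: one-group variant; the unscaled MAD is used as the spread.
    """
    med = statistics.median(values)
    mad = statistics.median(abs(x - med) for x in values)
    if mad == 0:
        raise ValueError("median absolute deviation is zero")
    z = [(x - med) / mad for x in values]
    q1, _, q3 = statistics.quantiles(z, n=4)
    threshold = q3 + (q3 - q1)
    return sum(x for x in z if x > threshold)

# Hypothetical expression values for three genes.
two_groups = [0.0] * 5 + [10.0] * 5                # balanced bimodal gene
no_outliers = [float(x) for x in range(1, 11)]     # evenly spread gene
one_outlier = [float(x) for x in range(1, 10)] + [100.0]

kurt_bimodal = excess_kurtosis(two_groups)
os_flat = outlier_sum(no_outliers)
os_tail = outlier_sum(one_outlier)
```

A balanced two-group gene gives a strongly negative excess kurtosis, while the outlier-sum stays at zero unless a few samples sit far out in the upper tail, so the two scores pick up different kinds of departure from a unimodal shape.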

Hochschulbibliothekszentrum des Landes Nordrhein-Westfalen (hbz)

Discovering local patterns of co-evolution: computational aspects and biological examples

Author: A Tanay
B Dujon
B Snel
C Goh
D Barker
D Barker
D Chamovitz
D Juan
D Ober
D Scannell
DM Krylov
DP Wall
E Oron
F Pazos
F Pazos
I Wapinski
J Wu
JB MacQueen
K Wolfe
LM o Rami'rez
M Benton
Martin Kupiec
O Man
P Jaccard
PM Bowers
R Chenna
R Singh
RL Tatusov
S Grossmann
S Ohno
T Przytycka
T Pupko
T Tuller
T Tuller
Tamir Tuller
TD Bie
Y Chena
Y Cheng
Yifat Felder
Z Yang
Z Yang
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract Background Co-evolution is the process in which two (or more) sets of orthologs exhibit a similar or correlative pattern of evolution. Co-evolution is a powerful way to learn about the functional interdependencies between sets of genes and cellular functions and to predict physical interactions. More generally, it can be used for answering fundamental questions about the evolution of biological systems. Orthologs that exhibit a strong signal of co-evolution in a certain part of the evolutionary tree may show a mild signal of co-evolution in other branches of the tree. The major reasons for this phenomenon are noise in the biological input, genes that gain or lose functions, and the fact that some measures of co-evolution relate to rare events such as positive selection. Previous publications in the field dealt with the problem of finding sets of genes that co-evolved along an entire underlying phylogenetic tree, without considering the fact that often co-evolution is local. Results In this work, we describe a new set of biological problems that are related to finding patterns of local co-evolution. We discuss their computational complexity and design algorithms for solving them. These algorithms outperform other bi-clustering methods as they are designed specifically for solving the set of problems mentioned above. We use our approach to trace the co-evolution of fungal, eukaryotic, and mammalian genes at high resolution across the different parts of the corresponding phylogenetic trees. Specifically, we discover regions in the fungi tree that are enriched with positive evolution. We show that metabolic genes exhibit a remarkable level of co-evolution and different patterns of co-evolution in various biological datasets. In addition, we find that protein complexes that are related to gene expression exhibit non-homogeneous levels of co-evolution across different parts of the fungi evolutionary line. In the case of mammalian evolution, signaling pathways that are related to neurotransmission exhibit a relatively higher level of co-evolution along the primate subtree. Conclusions We show that finding local patterns of co-evolution is a computationally challenging task and we offer novel algorithms that allow us to solve this problem, thus opening a new approach for analyzing the evolution of biological systems.

Public Library of Science (PLOS)

Differential Localization and Independent Acquisition of the H3K9me2 and H3K9me3 Chromatin Modifications in the Caenorhabditis elegans Adult Germ Line

Author: A Fire
A Gartner
A Kim
AF Dernburg
AH Peters
AJ MacQueen
AJ MacQueen
AM Khalil
AM Villeneuve
Anne M. Villeneuve
B Czermin
C Seum
CE Schaner
CJ Bean
CM Phillips
CM Phillips
CR Vakoc
CS Ketel
D Hansen
DC Schultz
DC Shakes
DS Fay
DT Stinchcomb
EC Andersen
EE Capowski
EM Maine
Erik C. Andersen
F Couteau
F Solari
G Poulin
Gregory P. Copenhaver
H Wang
J Hodgkin
J Muller
J Ouellet
J Polanowska
J Shi
J Yoon
JB Bessler
JC Rice
JC Rice
JC Yasuhara
Jessica B. Bessler
JR Whetstine
KC Reddy
L Yang
LB Bender
LB Bender
M Zetka
MK Montgomery
N Bhalla
P Kolasinska-Zwierz
P Trojer
S Ward
SA Broverman
SI Grewal
T Kouzarides
T Yuzyuk
TY Tzeng
V Reinke
WG Kelly
WG Kelly
Publication venue: Public Library of Science
Publication date: 01/01/2010

Histone methylation is a prominent feature of eukaryotic chromatin that modulates multiple aspects of chromosome function. Methyl modification can occur on several different amino acid residues and in distinct mono-, di-, and tri-methyl states. However, the interplay among these distinct modification states is not well understood. Here we investigate the relationships between dimethyl and trimethyl modifications on lysine 9 of histone H3 (H3K9me2 and H3K9me3) in the adult Caenorhabditis elegans germ line. Simultaneous immunofluorescence reveals very different temporal/spatial localization patterns for H3K9me2 and H3K9me3. While H3K9me2 is enriched on unpaired sex chromosomes and undergoes dynamic changes as germ cells progress through meiotic prophase, we demonstrate here that H3K9me3 is not enriched on unpaired sex chromosomes and localizes to all chromosomes in all germ cells in adult hermaphrodites and until the primary spermatocyte stage in males. Moreover, high-copy transgene arrays carrying somatic-cell specific promoters are highly enriched for H3K9me3 (but not H3K9me2) and correlate with DAPI-faint chromatin domains. We further demonstrate that the H3K9me2 and H3K9me3 marks are acquired independently. MET-2, a member of the SETDB histone methyltransferase (HMTase) family, is required for all detectable germline H3K9me2 but is dispensable for H3K9me3 in adult germ cells. Conversely, we show that the HMTase MES-2, an E(z) homolog responsible for H3K27 methylation in adult germ cells, is required for much of the germline H3K9me3 but is dispensable for H3K9me2. Phenotypic analysis of met-2 mutants indicates that MET-2 is nonessential for fertility but inhibits ectopic germ cell proliferation and contributes to the fidelity of chromosome inheritance. Our demonstration of the differential localization and independent acquisition of H3K9me2 and H3K9me3 implies that the trimethyl modification of H3K9 is not built upon the dimethyl modification in this context. Further, these and other data support a model in which these two modifications function independently in adult C. elegans germ cells.

Identifying microRNA/mRNA dysregulations in ovarian cancer

Author: A Berchuck
A Laios
AC Society
AK Godwin
C Rosty
CA Wilson
CP Masamha
D Botta
D Spentzos
D Ye
D Zhang
EJ Nam
el SA Arafa
Gregory D Miles
Gunaretnam Rajagopal
Gyan Bhanot
H Donninger
H Liu
H Yang
HC Dan
I Hoffmann
I Tamm
JB MacQueen
JM Lancaster
K Macleod
K Pearson
K Selvendiran
KH Lu
L Zhang
LC Hartmann
Lorna Rodriguez
M Seiler
MC Todd
Michael Seiler
MV Iorio
N Dahiya
P Li
PO Humbert
R Chekerov
R Siegel
S Ma
SK Wyman
T Raemaekers
Publication venue: BioMed Central
Publication date: 01/01/2012

Abstract Background MicroRNAs are a class of noncoding RNA molecules that co-regulate the expression of multiple genes via mRNA transcript degradation or translation inhibition. Since they often target entire pathways, they may be better drug targets than genes or proteins. MicroRNAs are known to be dysregulated in many tumours and associated with aggressive or poor prognosis phenotypes. Since they regulate mRNA in a tissue specific manner, their functional mRNA targets are poorly understood. In previous work, we developed a method to identify direct mRNA targets of microRNA from patient matched microRNA/mRNA expression data using an anti-correlation signature. This method, applied to clear cell Renal Cell Carcinoma (ccRCC), revealed many new regulatory pathways compromised in ccRCC. In the present paper, we apply this method to identify dysregulated microRNA/mRNA mechanisms in ovarian cancer using data from The Cancer Genome Atlas (TCGA). Methods TCGA Microarray data was normalized and samples whose class labels (tumour or normal) were ambiguous with respect to consensus ensemble K-Means clustering were removed. Significantly anti-correlated and correlated genes/microRNA differentially expressed between tumour and normal samples were identified. TargetScan was used to identify gene targets of microRNA. Results We identified novel microRNA/mRNA mechanisms in ovarian cancer. For example, the expression level of RAD51AP1 was found to be strongly anti-correlated with the expression of hsa-miR-140-3p, which was significantly down-regulated in the tumour samples. The anti-correlation signature was present separately in the tumour and normal samples, suggesting a direct causal dysregulation of RAD51AP1 by hsa-miR-140-3p in the ovary. Other pairs of potential biological relevance include: hsa-miR-145/E2F3, hsa-miR-139-5p/TOP2A, and hsa-miR-133a/GCLC. We also identified sets of positively correlated microRNA/mRNA pairs that most likely result from indirect regulatory mechanisms. Conclusions Our findings identify novel microRNA/mRNA relationships that can be verified experimentally. We identify both generic microRNA/mRNA regulation mechanisms in the ovary as well as specific microRNA/mRNA controls which are turned on or off in ovarian tumours. Our results suggest that the disease process uses specific mechanisms which may be significant for their utility as early detection biomarkers or in the development of microRNA therapies in treating ovarian cancers. The positively correlated microRNA/mRNA pairs suggest the existence of novel regulatory mechanisms that proceed via intermediate states (indirect regulation) in ovarian tumorigenesis.
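
The anti-correlation signature mentioned above, present separately in the tumour and normal samples, can be illustrated with a plain Pearson correlation computed per sample group. This is a hedged sketch rather than the published pipeline: the cutoff of -0.5, the helper name `anticorrelated_in_each_group`, and the expression values are all assumptions, and no significance testing or TargetScan lookup is included.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)

def anticorrelated_in_each_group(mirna, mrna, labels, cutoff=-0.5):
    """True if the pair is anti-correlated within every sample group.

    Assumption: the cutoff is an arbitrary illustrative threshold.
    """
    groups = {}
    for a, b, label in zip(mirna, mrna, labels):
        xs, ys = groups.setdefault(label, ([], []))
        xs.append(a)
        ys.append(b)
    return all(pearson(xs, ys) <= cutoff for xs, ys in groups.values())

# Made-up, patient-matched expression values for one microRNA and two mRNAs.
labels = ["tumour"] * 4 + ["normal"] * 4
mirna_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
mrna_down = [8.0, 7.0, 5.0, 4.0, 4.0, 3.0, 2.0, 1.0]   # falls as the microRNA rises
mrna_up = [4.0, 5.0, 7.0, 8.0, 1.0, 2.0, 3.0, 4.0]     # rises with the microRNA

direct_candidate = anticorrelated_in_each_group(mirna_x, mrna_down, labels)
indirect_candidate = anticorrelated_in_each_group(mirna_x, mrna_up, labels)
```

Checking each group on its own guards against a pair that looks anti-correlated only because tumour and normal samples differ in mean expression.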

Elsevier - Publisher Connector

Misty Mountain clustering: application to fast unsupervised flow cytometry gating

Author: A Cuevas
A Cuevas
AP Dempster
B Scholkopf
BJ Frey
C Fraley
CJC Burges
CW Morris
D Stauffer
G Celeux
G Cornuejols
G Lizard
G Schwarz
GC Tseng
GEP Box
GJ McLachlan
H Hotelling
István P Sugár
J Hoshen
JA Hartigan
JB MacQueen
K Lo
K Lo
KH Knuth
L Boddy
L Boddy
L Breiman
LJ Heyer
M Fiedler
MB Eisen
MF Wilkins
MP Wand
PJ Rousseeuw
PO Krutzik
R Kothari
RF Murphy
RJ Beckman
RL Boyell
RR Brinkman
S Demers
S Kirkpatrick
S Pyne
Stuart C Sealfon
TC Bakker Schut
W Feller
W Jang
W Jang
WE Donath
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract Background There are many important clustering questions in computational biology for which no satisfactory method exists. Automated clustering algorithms, when applied to large, multidimensional datasets, such as flow cytometry data, prove unsatisfactory in terms of speed, problems with local minima or cluster shape bias. Model-based approaches are restricted by the assumptions of the fitting functions. Furthermore, model-based clustering requires serial clustering for all cluster numbers within a user-defined interval. The final cluster number is then selected by various criteria. These supervised serial clustering methods are time-consuming, and different criteria frequently result in different optimal cluster numbers. Various unsupervised heuristic approaches that have been developed, such as affinity propagation, are too expensive to be applied to datasets on the order of 10^6 points that are often generated by high-throughput experiments. Results To circumvent these limitations, we developed a new, unsupervised density contour clustering algorithm, called Misty Mountain, that is based on percolation theory and that efficiently analyzes large data sets. The approach can be envisioned as a progressive top-down removal of clouds covering a data histogram relief map to identify clusters by the appearance of statistically distinct peaks and ridges. This is a parallel clustering method that finds every cluster after analyzing the cross sections of the histogram only once. The overall run time for the composite steps of the algorithm increases linearly with the number of data points. The clustering of 10^6 data points in 2D data space takes place within about 15 seconds on a standard laptop PC. Comparison of the performance of this algorithm with other state-of-the-art automated flow cytometry gating methods indicates that Misty Mountain provides substantial improvements in both run time and in the accuracy of cluster assignment. Conclusions Misty Mountain is fast, unbiased for cluster shape, identifies stable clusters and is robust to noise. It provides a useful, general solution for multidimensional clustering problems. We demonstrate its suitability for automated gating of flow cytometry data.
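
The "top-down removal of clouds" picture can be illustrated on a one-dimensional histogram. The sketch below is not the Misty Mountain algorithm: it is a much simpler level sweep that visits bins from tallest to shortest, starts a new peak when a bin has no already-uncovered neighbour, and merges peaks when they meet. The function name `sweep_peaks`, the `min_prominence` parameter, and the histogram values are hypothetical.

```python
def sweep_peaks(hist, min_prominence=0):
    """Return indices of peaks found by lowering a level over a 1D histogram.

    A peak that merges into a taller one is kept only if it rose at least
    min_prominence above the level at which the two met.
    """
    order = sorted(range(len(hist)), key=lambda i: -hist[i])
    parent = {}  # uncovered bin -> representative (peak) bin of its component

    def find(i):
        # Follow parent links to the component's peak, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    kept = []
    for i in order:
        roots = {find(j) for j in (i - 1, i + 1) if j in parent}
        if not roots:
            parent[i] = i  # isolated bin above the level: a new peak appears
            continue
        main = max(roots, key=lambda r: hist[r])
        parent[i] = main
        for r in roots - {main}:
            # The lower peak r meets the taller one at height hist[i].
            if hist[r] - hist[i] >= min_prominence:
                kept.append(r)
            parent[r] = main
    # Peaks that were never absorbed by a taller neighbour.
    kept.extend(r for r in parent if parent[r] == r)
    return sorted(kept)

# Made-up histogram with a smaller peak at bin 2 and a taller one at bin 6.
histogram = [0, 2, 5, 2, 1, 3, 7, 3, 0]
all_peaks = sweep_peaks(histogram)
strong_peaks = sweep_peaks(histogram, min_prominence=5)
```

Each bin is visited once after sorting, which mirrors the abstract's point that the cross sections are analysed a single time. The prominence threshold is a simple stand-in for a test of whether a peak is statistically distinct.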